Import Libraries and Load Data

This block imports the necessary libraries and loads the COVID-19 deaths and confirmed-cases datasets. plotly.graph_objects is used to create interactive plots, and pandas is used for data manipulation.

In [1]:
import plotly.graph_objects as go
import pandas as pd

# Load the data
deaths_data = pd.read_csv('time_series_covid19_deaths_US.csv')
confirmed_data = pd.read_csv('time_series_covid19_confirmed_US.csv')

Data Aggregation

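The data is now aggregated by U.S. state. The metadata columns (identifiers, coordinates, and, in the deaths file, Population) are dropped by name, and both datasets are then grouped by Province_State and summed, leaving one row per state and one column per date. Dropping by name rather than slicing at a fixed column offset matters because the deaths file has one more metadata column than the confirmed file, so a single offset cannot be right for both.

In [2]:
# Metadata columns to exclude; 'Population' exists only in the deaths file
meta_cols = ['UID', 'iso2', 'iso3', 'code3', 'FIPS', 'Admin2', 'Country_Region',
             'Lat', 'Long_', 'Combined_Key', 'Population']

# Aggregate the date columns by state
state_deaths = deaths_data.drop(columns=meta_cols, errors='ignore').groupby('Province_State').sum()
state_cases = confirmed_data.drop(columns=meta_cols, errors='ignore').groupby('Province_State').sum()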
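
As a minimal sketch of the aggregation step, the toy frame below sums date columns per state. The column names mirror those in the files, but the values are invented, and detecting date columns by parsing their names is an assumption made for illustration rather than what the notebook itself does.

```python
import pandas as pd

# Toy stand-in for a county-level file (values are invented for illustration)
toy = pd.DataFrame({
    "UID": [1, 2, 3],
    "Admin2": ["A", "B", "C"],
    "Province_State": ["Alabama", "Alabama", "Alaska"],
    "Lat": [32.5, 30.7, 61.2],
    "1/22/20": [0, 1, 2],
    "1/23/20": [3, 4, 5],
})

# Keep only the columns whose names parse as m/d/yy dates
is_date = pd.to_datetime(toy.columns, format="%m/%d/%y", errors="coerce").notna()
date_cols = list(toy.columns[is_date])

# One row per state, one column per date
toy_by_state = toy.groupby("Province_State")[date_cols].sum()
print(toy_by_state)
```

Selecting the date columns before summing keeps identifiers and coordinates out of the totals regardless of how many metadata columns a file has.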

Create Interactive Plot with Dropdown Menu

This block builds the visualization. It initializes a Plotly figure and adds two traces for each state, one for deaths and one for cases. Each trace plots that state's time series.

The dropdown menu lets users show a single state's traces or all states at once. Selecting a state hides every other trace and updates the title, which makes it easier to follow one state's trend; the "All States" option remains available for comparing states.

In [3]:
# Create an interactive plot with a dropdown menu
fig = go.Figure()

# Add all traces, initially visible
states = state_deaths.index
for state in states:
    fig.add_trace(go.Scatter(
        x=state_deaths.columns,
        y=state_deaths.loc[state],
        mode='lines',
        name=f"{state} Deaths",
        visible=True  # Initially all are visible
    ))
    fig.add_trace(go.Scatter(
        x=state_cases.columns,
        y=state_cases.loc[state],
        mode='lines',
        name=f"{state} Cases",
        visible=True  # Initially all are visible
    ))

# Update layout with a dropdown menu
fig.update_layout(
    title='COVID-19 Cases and Deaths by State Over Time',
    xaxis_title='Date',
    yaxis_title='Number of Cases/Deaths',
    updatemenus=[{
        'buttons': [
            {
                'label': 'All States',
                'method': 'update',
                'args': [{'visible': [True] * len(fig.data)},
                         {'title': 'COVID-19 Cases and Deaths by State Over Time'}]
            }] + [
            {
                'label': state,
                'method': 'update',
                'args': [{'visible': [s == state for s in states for _ in (0, 1)]},  # one flag per trace: deaths, cases
                         {'title': f'COVID-19 Cases and Deaths in {state}'}]  # Update title
            } for state in states
        ],
        'direction': 'down',
        'showactive': True,
    }],
    hovermode='closest'
)

fig.show()
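
The dropdown relies on a visibility list with exactly one boolean per trace, in the order the traces were added. The sketch below isolates that logic in a small helper; the function name visibility_mask and the demo state list are hypothetical and not part of the notebook.

```python
def visibility_mask(states, selected, traces_per_state=2):
    """Return one boolean per trace, in the order the traces were added."""
    return [s == selected for s in states for _ in range(traces_per_state)]


# Hypothetical three-state example: two traces (deaths, cases) per state
states_demo = ["Alabama", "Alaska", "Arizona"]
mask = visibility_mask(states_demo, "Alaska")
print(mask)
```

The mask has two entries per state because each state contributes a deaths trace and a cases trace, so its length always matches the number of traces in the figure.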

This block imports the libraries for the second analysis. plotly.express is used below to draw the bar chart.

In [5]:
import plotly.express as px
from datetime import datetime

This cell cleans the data by dropping the metadata columns, summing by state, and transposing the DataFrame so that each row represents a date and each column a state. It then converts the index to datetime so the data can be sliced by time frame.

In [6]:
# Remove non-date columns for cleaner processing
confirmed_statewise = confirmed_data.drop(columns=['UID', 'iso2', 'iso3', 'code3', 'FIPS', 'Admin2', 'Country_Region', 'Lat', 'Long_', 'Combined_Key']).groupby('Province_State').sum().transpose()
deaths_statewise = deaths_data.drop(columns=['UID', 'iso2', 'iso3', 'code3', 'FIPS', 'Admin2', 'Country_Region', 'Lat', 'Long_', 'Combined_Key', 'Population']).groupby('Province_State').sum().transpose()

# Convert index to datetime to filter by month and year
confirmed_statewise.index = pd.to_datetime(confirmed_statewise.index, format='%m/%d/%y')
deaths_statewise.index = pd.to_datetime(deaths_statewise.index, format='%m/%d/%y')
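
A short sketch of why the datetime conversion helps: once the labels are parsed, a single calendar month of a single year can be selected with a boolean mask. The four labels below are made up for illustration and only follow the same month/day/two-digit-year pattern as the file headers.

```python
import pandas as pd

# Labels in the style of the time-series headers (month/day/two-digit year)
labels = pd.Index(["1/22/20", "2/1/23", "2/28/23", "3/1/23"])
dates = pd.to_datetime(labels, format="%m/%d/%y")

# Boolean mask for one month of one year; checking the month alone
# would also match February of every other year in the data
in_feb_2023 = (dates.year == 2023) & (dates.month == 2)
print(dates[in_feb_2023])
```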

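The code now filters the data to February 2023 and computes the case fatality rate (CFR) for each day as deaths divided by confirmed cases, expressed as a percentage. Because the source time series hold cumulative counts, each daily value is the cumulative CFR as of that date. The code then averages the daily values over the month for each state, orders the states alphabetically, and converts the result to a DataFrame for plotting.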

In [7]:
# Extract data for February 2023 (match both year and month)
february_2023 = (confirmed_statewise.index.year == 2023) & (confirmed_statewise.index.month == 2)
confirmed_february = confirmed_statewise[february_2023]
deaths_february = deaths_statewise[february_2023]

# Calculate daily CFR for February 2023
cfr_february = (deaths_february / confirmed_february) * 100

# Calculate the average CFR for the month for each state
average_cfr_february = cfr_february.mean()

# Sort the states alphabetically by index
average_cfr_february_sorted = average_cfr_february.sort_index()

# Convert to DataFrame for Plotly
cfr_df = average_cfr_february_sorted.reset_index()
cfr_df.columns = ['State', 'Average CFR']
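
The CFR arithmetic can be checked on a tiny example. The sketch below uses invented numbers for two hypothetical states. Replacing zero case counts with NaN before dividing is an added assumption, not something the notebook does; it keeps a state with no confirmed cases from producing a division by zero.

```python
import numpy as np
import pandas as pd

# Two days of invented counts for two hypothetical states
idx = pd.to_datetime(["2023-02-01", "2023-02-02"])
confirmed_toy = pd.DataFrame({"State A": [100, 200], "State B": [0, 0]}, index=idx)
deaths_toy = pd.DataFrame({"State A": [1, 4], "State B": [0, 0]}, index=idx)

# Daily CFR in percent; zero confirmed cases are treated as undefined (NaN)
daily_cfr = deaths_toy / confirmed_toy.replace(0, np.nan) * 100

# mean() skips NaN by default, so a state with no cases stays NaN
mean_cfr = daily_cfr.mean()
print(mean_cfr)
```

For State A the daily values are 1% and 2%, so the monthly average is 1.5%. State B has no confirmed cases, so its CFR is left undefined instead of being reported as zero.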

Finally, an interactive bar chart displays the average CFR by state for February 2023. Hovering over a bar shows that state's CFR to two decimal places.

In [8]:
# Create an interactive bar chart using Plotly
fig = px.bar(cfr_df, x='State', y='Average CFR',
             title='Average COVID-19 Case Fatality Rate (CFR) by State for February 2023',
             labels={'Average CFR': 'Average CFR (%)', 'State': 'State'},
             hover_data={'State': True, 'Average CFR': ':.2f'})

fig.show()